Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stdlib: make dot product of Hermitian matrices real #52318

Merged
merged 6 commits into from
Feb 3, 2024
Merged

stdlib: make dot product of Hermitian matrices real #52318

merged 6 commits into from
Feb 3, 2024

Conversation

araujoms
Copy link
Contributor

Although mathematically the dot product between Hermitian matrices is always real, LinearAlgebra stores it in a complex variable with zero imaginary part. This prevents checking whether the value is real from the type only, which causes problems in certain applications (in my case, optimization with JuMP).

Of course, there is a simple workaround of calling it as real(dot(a,b)), but it's more elegant to produce a real type in the first place.

stevengj
stevengj previously approved these changes Nov 27, 2023
@stevengj
Copy link
Member

LGTM.

It should be more efficient too, since the compiler should be able to optimize away the computation of the imaginary parts that are discarded.

(At first I thought there might be a problem with Hermitian matrices whose elements are themselves matrices, but it's okay because dot is recursive.)

@stevengj stevengj added the linear algebra Linear algebra label Nov 27, 2023
@araujoms
Copy link
Contributor Author

I hadn't thought of that. It goes from 2(d^2+d) multiplications of real numbers to 2d^2, so there should be a noticeable performance increase if someone is taking billions of inner products of tiny matrices. I did a quick benchmark here, and indeed the execution time decreased by roughly the expected amount.

@andreasnoack
Copy link
Member

Although mathematically the dot product between Hermitian matrices is always real

I think this only holds when multiplication of the elements is commutative and since Hermitian allows for arbitrary element types I guess we'd have to restrict this method to Complex.

julia> Ac, Bc = [complex.(randn(3,3), randn(3, 3)) |> t -> t + t' for i in 1:2];

julia> Aq, Bq = [Quaternion.(randn(3,3), randn(3, 3), randn(3, 3), randn(3,3)) |> t -> t + t' for i in 1:2];

julia> all(ishermitian, (Ac, Bc, Aq, Bq))
true

julia> dot(Ac, Bc)
-5.958241125745531 - 1.0340753672895848e-16im

julia> dot(Aq, Bq)
QuaternionF64(12.353238961023004, 9.185676749747854, -0.2811577397685232, -25.49656747643778)

@araujoms
Copy link
Contributor Author

I think what is needed for the dot product to be real is conj(a*b) = conj(a)*conj(b), which does not hold for quaternions. I think it would be sensible to restrict this method to Complex and let Hermitian quaternionic matrices fall back to the general case; it's such a rare use case that the perfomance penalty won't matter.

I don't really know how to restrict it to Complex, though. What I came up with is

for (T, trans, real) in [(:Symmetric, :transpose, :identity), (:(Hermitian{Complex{S}} where S <: Real), :adjoint, :real)] 

But surely there must be a more elegant way.

@stevengj
Copy link
Member

stevengj commented Nov 28, 2023

Maybe (:(Hermitian{<:Union{Real,Complex}}), :adjoint, :real) ?

(Note that we also want to allow Hermitian{<:Real}, in which case adjoint and real are identity operations but there is still a 2x speedup.)

@stevengj
Copy link
Member

stevengj commented Nov 28, 2023

Note that restricting it to complex/real Hermitian matrices will be a bug fix, since the current version also fails when a quaternionic matrix is wrapped in Hermitian. For @andreasnoack's example:

julia> dot(Aq, Bq)
QuaternionF64(-9.544231167880467, -3.8948768522921418, 26.649977520155836, 5.9007891169097455)

julia> dot(Hermitian(Aq), Hermitian(Bq))
QuaternionF64(-9.544231167880467, 0.0, 0.0, 0.0)

julia> Aq == Hermitian(Aq), Bq == Hermitian(Bq)
(true, true)

Would be good to add a test for this.

Should there be a separate bugfix PR that just corrects this, without changing the return type, so that it can be backported?

@araujoms
Copy link
Contributor Author

Thanks, that works. Indeed, we want real Hermitian matrices as well, as I often use that myself for generality. I just didn't know Real wasn't a subtype of Complex.

@stevengj
Copy link
Member

Update: I pushed a bugfix PR to #52333 so that it can be backported. We should probably merge that one first, since it will conflict with this one.

for (T, trans, real) in [(:Symmetric, :transpose, :identity), (:Hermitian, :adjoint, :real)]
for (T, trans, real) in [
(:Symmetric, :transpose, :identity),
(:(Hermitian{<:Union{Real,Complex}}), :adjoint, :real),
Copy link
Contributor

@mikmoore mikmoore Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's slightly unfortunate that this specialization prevents this from being applicable to e.g. Hermitian{<:AbstractMatrix{<:Union{Real,Complex}}}. Although I'm not sure how we might engineer a definition that's robust through multiple levels of nested AbstractMatrix so it would probably be hard to do something about this right now.

I'll also remark that this specialization should also be suitable for dot(::Symmetric{<:Real},::Hermitian{<:Complex}}, although currently it appears we miss this case. Would LinearAlgebra.RealHermSymComplexHerm be suitable here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a design flaw of LinearAlgebra. Symmetric should have been defined as real symmetric, and should be made a subtype of Hermitian, as mathematically a real symmetric matrix is just a particular case of a complex Hermitian matrix, and we'd get for free all of these particular cases without having to write a method for each specifically. Then one could also have a GenericSymmetric to deal with the more exotic cases if anyone needed them.

FWIW I did think about mixing real symmetric and complex Hermitian, but I decided against adding additional methods that were going to be rarely used. Of course, I know my opinion counts for very little here.

@araujoms
Copy link
Contributor Author

No, wait! I've just realized that this breaks my own use case! If we restrict it to real or complex Hermitians then it won't work with JuMP variables anymore, and when I use dot I'll fall back to the generic dot, which is not only two times slower but also will not give me a real type as answer as I wanted.

Using Hermitian{<:Union{Real,Complex,JuMP.GenericAffExpr}} would work, but it would make LinearAlgebra depend on JuMP, which I assume is not acceptable. Is there a way to fix that?

@araujoms
Copy link
Contributor Author

On a second thought, maybe the best solution is to not restrict it to subtypes of Complex and Real, letting it just fail for quaternions? I don't see a reason to ever wrap a quaternion in Hermitian. LinearAlgebra doesn't implement any method for it. I've also looked in Quaternions.jl and Quaternionic.jl, and ditto, there's nothing there.

That would fix my problem, @mikmoore 's problem, and potentially other cases as well of things that are not subtypes of Complex and Real but nevertheless behave well?

I also wanted to contributed a specialized kronecker product between Hermitian matrices, but this will suffer from the same issue: the kronecker product between quaternionic Hermitian matrices is not Hermitian (again because conj(a*b) != conj(a)*conj(b)). If I restrict it to subtypes of Complex and Real that will defeat the entire point, as I want to use it with JuMP.

@stevengj
Copy link
Member

stevengj commented Dec 1, 2023

On a second thought, maybe the best solution is to not restrict it to subtypes of Complex and Real, letting it just fail for quaternions?

Silently giving wrong answers doesn't seem acceptable for the standard library.

I agree that Hermitian{Quaternion} is an odd case that surely doesn't come up very often, but it's allowed by the Hermitian type and it's mathematically reasonable. (For example, a Hermitian{Quaternion} matrix has real eigenvalues and orthogonal eigenvectors. In fact, there was a recent paper about this kind of thing.)

Using Hermitian{<:Union{Real,Complex,JuMP.GenericAffExpr}} would work, but it would make LinearAlgebra depend on JuMP, which I assume is not acceptable. Is there a way to fix that?

JuMP could add a dot method for Hermitian{GenericAffExpr}.

To make it easier to extend in this way, we could implement the base method as:

dot(A::Symmetric, B::Symmetric) = _dot_hermsym(A, B, transpose, identity)
dot(A::RealHermSymComplexHerm, B::RealHermSymComplexHerm) = _dot_hermsym(A, B, adjoint, real)
dot(A::Symmetric{<:Real}, B::Symmetric{<:Real}) = _dot_hermsym(A, B, transpose, identity) # fix method ambiguity

function _dot_hermsym(A, B, trans, real)
   #... optimized method 
end

and then JuMP (or anyone else who needs the optimized method for a new type) would just need to add a 1-line method that calls LinearAlgebra._dot_hermsym.

(This also seems a lot cleaner than the macro-based approach.)

@stevengj stevengj dismissed their stale review December 1, 2023 02:45

bugfix PR should be merged first

@araujoms
Copy link
Contributor Author

araujoms commented Dec 1, 2023

It wouldn't be a one-linear because JuMP (well MutableArithmetics to be more precise) also reimplements LinearAlgebra.dot to operate efficiently on mutable objects. Nevertheless, it would be easily doable. They just won't be happy about it because literally yesterday they merged a dispatch for Hermitian objects I had been asking about: jump-dev/MutableArithmetics.jl#248 .

@odow would you be fine with this?

@dkarrasch
Copy link
Member

I see two options: (i) merge as is since this is as far as we can go for a generic stdlib, or (ii) introduce a function barrier and have an type-unrestricted "dot kernel" that can be hooked into by packages, as suggested by @stevengj. If (ii) doesn't help much in the concrete JuMP context, then I suggest to merge as is and leave an efficient mutable definition to packages.

@araujoms
Copy link
Contributor Author

araujoms commented Dec 7, 2023

I would prefer (ii); it does solve the problem for JuMP, even if its devs haven't expressed interest, and will also make it possible for other packages to use the faster version as well. Think of it as future-proofing.

@odow
Copy link
Contributor

odow commented Dec 7, 2023

@blegat is the one who knows this stuff.

My only comment: isn't changing this return type from Complex to Real breaking?

@blegat
Copy link
Contributor

blegat commented Dec 7, 2023

Why does it solve the problem for JuMP ? We won't be calling _dot_hermsym since it doesn't exploit mutability.

@araujoms
Copy link
Contributor Author

araujoms commented Dec 7, 2023

Why not? You were calling dot(A::Hermitian,B:: Hermitian) before this change. I assumed that exactly the same trick you used to call dot efficiently through operate could be applied to _dot_hermsym.

end

dotprod = real(zero(dot(first(A), first(B))))
@inbounds if A.uplo == 'U' && B.uplo == 'U'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we stop using this? This function assumes but does not enforce 1-based indexing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parents of Symmetric and Hermitian are required to be one-based at construction (see the very top of this file), so this is safe.

@odow
Copy link
Contributor

odow commented Dec 8, 2023

would just need to add a 1-line method that calls LinearAlgebra._dot_hermsym.

I know there are places where we currently call or overload LinearAlgebra._ methods, but I'd really prefer that we didn't.

@araujoms
Copy link
Contributor Author

Can we put this PR out of its misery? I'm fine with either of your options @dkarrasch , just make the call.

@dkarrasch
Copy link
Member

If this doesn't solve the issue for MutableArithmetics.jl anyway, I don't see much value in having the internal extra method. In that case, I'd suggest that third-party packages simply take the rough optimization idea from here and add whatever is specific to their case (like mutability) and overload dot directly. It's more code to have in those packages, but it's not too complicated either. That should be much cleaner in terms of extending public API. If people agree, that would mean to revert the last commit.

@araujoms
Copy link
Contributor Author

Fine by me. Can you cherry pick the commits or do I need to revert it myself?

@dkarrasch
Copy link
Member

Let's wait for some feedback, but if people agree, then you could push a simple revert commit.

@dkarrasch
Copy link
Member

In the absence of further comments, let's revert the last commit and then merge (if tests pass). If we find a use case that would benefit from having an unrestricted "kernel", we could still change it to that, but it seems like the only external use case we have would not use this kernel anyway.

@araujoms
Copy link
Contributor Author

Done. Failing test is unrelated.

@araujoms
Copy link
Contributor Author

I'm sorry to ping you again, but is there any obstruction to merging? I really want to get this over with.

@araujoms
Copy link
Contributor Author

araujoms commented Feb 1, 2024

Is there anything wrong? It's been a month.

@dkarrasch
Copy link
Member

Sorry, no, there's nothing wrong. I'll rerun CI one more time (many tests failed, but seemingly unrelated), and then merge once tests pass.

@dkarrasch dkarrasch added the merge me PR is reviewed. Merge when all tests are passing label Feb 1, 2024
@inkydragon
Copy link
Member

CI: All test passed, ASAN timeout is unrelated.

@inkydragon inkydragon changed the title make dot product of Hermitian matrices real stdlib: make dot product of Hermitian matrices real Feb 3, 2024
@inkydragon inkydragon merged commit fda9321 into JuliaLang:master Feb 3, 2024
6 of 8 checks passed
@inkydragon inkydragon removed the merge me PR is reviewed. Merge when all tests are passing label Feb 3, 2024
@araujoms araujoms deleted the patch-1 branch February 3, 2024 01:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants